A comparison of Van der Linden's conditional equipercentile equating method with other equating methods under the random groups design
Author
Abstract
To ensure test security and fairness, alternative forms of the same test are administered in practice. However, alternative forms generally do not have the same difficulty level, even though they are designed to be as parallel as possible. Equating adjusts for differences in difficulty among forms of the test. Six traditional equating methods are considered in this study: equipercentile equating without smoothing, equipercentile equating with presmoothing, equipercentile equating with postsmoothing, IRT true-score equating (IRT-T), IRT observed-score equating (IRT-O), and kernel equating. A common feature of all of the traditional procedures is that the end result of equating is a single transformation (or conversion table) that is used for all examinees who take the same test. Van der Linden has proposed conditional equipercentile (or local) equating (CEE) to reduce the equating error contained in the traditional procedures by introducing individual-level equating. Van der Linden’s CEE is conceptually closest to IRT-T in that CEE is defined with respect to a type of true score (θ, or proficiency), but it shares similarities with IRT-O in that CEE uses an estimated observed-score distribution for each individual to equate scores using equipercentile equating. No real-data study has yet compared van der Linden’s CEE with each of the traditional equating procedures. Indeed, even for the traditional procedures, no study has compared all six of them simultaneously. In addition to van der Linden’s CEE, two variations of CEE are considered: CEE using maximum likelihood (CEE-MLE) and CEE using the true characteristic curve (CEE-TCC). The focus of this study is on comparing results from CEE vis-à-vis the traditional procedures, as opposed to answering a “best-procedure” question, which would require a common conception of “true” equating.
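As a rough illustration of the simplest method named in the abstract, the following sketch shows unsmoothed equipercentile equating for integer scores under a random groups design. It is not taken from the study: the function names `percentile_ranks` and `equipercentile` and the toy frequencies are hypothetical, and the mid-percentile-rank convention with linear interpolation is one common textbook formulation that the study may or may not have used.

```python
from itertools import accumulate


def percentile_ranks(freqs):
    """Mid-percentile rank (0-1 scale) of each integer score 0..K.

    Each score x is treated as covering the interval [x - 0.5, x + 0.5),
    so its rank is the proportion below x plus half the proportion at x.
    """
    n = sum(freqs)
    below = 0.0
    ranks = []
    for f in freqs:
        ranks.append((below + 0.5 * f) / n)
        below += f
    return ranks


def equipercentile(freqs_x, freqs_y):
    """Map each integer Form X score to the Form Y scale (hypothetical helper).

    Assumes the highest score on each form has a nonzero frequency.
    """
    n_y = sum(freqs_y)
    cdf_y = [c / n_y for c in accumulate(freqs_y)]
    equated = []
    for p in percentile_ranks(freqs_x):
        # Smallest Form Y score whose cumulative proportion exceeds p.
        y_u = next((y for y, g in enumerate(cdf_y) if g > p), len(cdf_y) - 1)
        g_below = cdf_y[y_u - 1] if y_u > 0 else 0.0
        # Interpolate linearly within the interval [y_u - 0.5, y_u + 0.5).
        equated.append(y_u - 0.5 + (p - g_below) / (cdf_y[y_u] - g_below))
    return equated


# Toy frequencies for scores 0, 1, 2 on each form (made-up numbers).
form_x = [4, 4, 2]
form_y = [2, 4, 4]
print(equipercentile(form_x, form_y))
```

The result is a single conversion table applied to every examinee, which is the shared feature of the traditional procedures that the abstract contrasts with individual-level (conditional) equating.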
Similar resources
Effectiveness of the hybrid Levine equipercentile and modified frequency estimation equating methods under the common-item nonequivalent groups design
The purpose of this study was to evaluate the effectiveness of the hybrid Levine equipercentile (Hybrid LE) and modified frequency estimation (MFE) equating methods in improving accuracy of equating as compared to the percentile rank frequency estimation (FE), kernel frequency estimation (Kernel FE) and percentile rank chained equipercentile (CE) equating methods under the common-item nonequiva...
The Missing Data Assumptions of the Nonequivalent Groups With Anchor Test (NEAT) Design and Their Implications for Test Equating
The nonequivalent groups with anchor test (NEAT) design involves missing data that are missing by design. Three popular equating m...
A Modified Frequency Estimation Equating Method for the Common-Item Non-Equivalent Groups Design
Frequency estimation (also called post-stratification) is an equating method employed under the common-item nonequivalent groups design. A modified frequency estimation method is proposed here, based on altering one of the traditional assumptions in frequency estimation in order to correct for equating bias. A simulation study was carried out to compare equating errors for the modified frequenc...
Selection the best Method of Equating Using Anchor-Test Design in Item Response Theory
Explaining the problem. The equating process is used to compare the scores of two different tests with the same theme. The goal of this research is to find the best method of equating data using the logistic model. Method. We use the data of the Ph.D. test in the Statistics major for two consecutive years, 92 and 93. For analyzing, we are specifically using the tests of Statistics major ...
Dichotomous or polytomous model? Equating of testlet-based tests in light of conditional item pair correlations
The performance of dichotomous and polytomous IRT models in equating testlet-based tests was compared in this study. To clarify the conditions under which dichotomous and polytomous item response models produce differing results, the DIMTEST program was used for testing essential unidimensionality, and a bias-corrected index (Final Condcorr) was adapted in this study for measuring local item dep...